Unrecognizable Sets of Numbers*

Marvin Minsky and Seymour Papert†
Massachusetts Institute of Technology, Cambridge, Massachusetts

Abstract. When is a set A of positive integers, represented as binary numbers, "regular" in
the sense that it is a set of sequences that can be recognized by a finite state machine? Let
$\pi_A(n)$ be the number of members of A less than the integer n. It is shown that the asymptotic
behavior of $\pi_A(n)$ is subject to severe restraints if A is regular. These constraints are violated
by many important natural numerical sets whose distribution functions can be calculated, at
least asymptotically. These include the set P of prime numbers, for which $\pi_P(n) \sim n/\log n$ for
large n, the set of integers A(k) of the form $n^k$, for which $\pi_{A(k)}(n) \sim n^{1/k}$, and many others. The
technique cannot, however, yield a decision procedure for regularity, since for every infinite
regular set A there is a nonregular set A' for which $|\pi_A(n) - \pi_{A'}(n)| \le 1$, so that the
asymptotic behaviors of the two distribution functions are essentially identical.

1. Introduction

Let A be some set of positive integers written in binary notation. It is natural to
ask what kind of computing machine could recognize [1] the set in the sense of
deciding whether a given binary sequence represents a number belonging to A. The
technique described in this note enables one to show that certain sets cannot be
recognized by finite state automata (i.e., these sets are not "regular" [2]). The
essential idea is this: Let $\pi_A(n)$ be the number of members of A less than the integer
n. It is shown that the asymptotic behavior of $\pi_A(n)$ is subject to severe restraints
if A is regular. These constraints are violated by many important natural numerical
sets whose distribution functions can be calculated, at least asymptotically. These
include the set P of prime numbers, for which $\pi_P(n) \sim n/\log n$ for large n, the set of
integers A(k) of the form $n^k$, for which $\pi_{A(k)}(n) \sim n^{1/k}$, and many others. The
technique cannot, however, yield a decision procedure for regularity, since for every
infinite regular set A there is a nonregular set A' for which $|\pi_A(n) - \pi_{A'}(n)| \le 1$,
so that the asymptotic behaviors of the two distribution functions are essentially
identical.

We consider here only the binary representation, so as to avoid pompous
statements, but the same results can be obtained for any radix by changing all 2's to r's
in the sequel. We warn readers not to confuse the statement that the primes written
in binary form are not a regular set with the trivial statement that the set of strings
of prime length is not regular.

(1) Consider the set of strings of 0's and 1's of which the first symbol is a 1, i.e.,
$N = 1(0 \vee 1)^*$, using Kleene's notation [2].

(2) Such strings are regarded ambiguously as integers to the base 2 or as strings
of 0's and 1's. Numbers are presented to the machine high digits first. (This
convention is innocuous since the set of reversed strings of a regular set is regular.)

* Added in proof: The authors have learned that Alan Cobham obtained substantially the
same results at about the same time.

† Both at Department of Electrical Engineering and Project MAC.
(3) For any integer x, L(x) is the number of digits in x, i.e., $2^{L(x)-1} \le x < 2^{L(x)}$.

(4) Define $x \cdot y = 2^{L(y)} x + y$; that is, "$\cdot$" means concatenation.

(5) A always denotes a set of positive integers and $\pi_A(n)$ the cardinality of
$A \cap \{x \mid 1 \le x < n\}$.

(6) If A is regular, let $M_A$ denote the reduced (i.e., minimal) automaton which
recognizes A. In the discussion of any particular automaton M, Q is used for its set of
states, $q_0$ for its initial state, $Q_F$ for its set of "final states" (whose occurrence signifies
acceptance of a string), and $\delta(q, x)$ for the state-transition function, i.e., the
sequence x drives the automaton from state q to the state $\delta(q, x)$.

(7) A dead state is a state q such that $\delta(q, x) \in Q_F$ is satisfied by no x. (If the
automaton is reduced this is equivalent to saying $\delta(q, x) = q$ for all x, since there is
only one dead state in a minimal machine.)

(8) We are interested in subsets of $N = 1(0 \vee 1)^*$ rather than in subsets of
$(0 \vee 1)^*$. We consider as trivial the part of the automaton which merely verifies
that the input sequence begins with a 1. We shall depart from the assumption of
minimality by allowing a special dead state into which the machine is driven by an
initial zero. In the sequel, "dead state" means dead state other than this special one.

(9) For convenience the following convention is adopted: For any set A, and any
real number x, $\pi_A(x) = \pi_A([x])$, where [x] is the integral part of x. Where $\pi_A(n)$ is a
"natural" function, such as $n/\log n$, $x/\log x$ will be used as an approximation of
$[x]/\log [x]$ for very large values of x.
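The notation of items (3) to (5) can be made concrete with a short Python sketch. This is only an illustration under stated assumptions: the function names `L`, `concat` and `pi`, and the idea of passing A as a membership predicate `member`, are ours and do not appear in the text.

```python
def L(x: int) -> int:
    """Number of binary digits of x, so that 2**(L(x)-1) <= x < 2**L(x)."""
    return x.bit_length()


def concat(x: int, y: int) -> int:
    """Concatenation x.y = 2**L(y) * x + y of the binary strings of x and y."""
    return (x << L(y)) + y


def pi(member, n: int) -> int:
    """pi_A(n): the number of members of A among 1, ..., n-1.

    `member` is a hypothetical predicate deciding membership in A.
    """
    return sum(1 for x in range(1, n) if member(x))


if __name__ == "__main__":
    print(L(5))                          # 5 is 101 in binary
    print(bin(concat(5, 3)))             # 101 followed by 11
    print(pi(lambda x: x % 2 == 1, 10))  # odd numbers below 10
```

Note that an integer y cannot carry leading zeros, so `concat` covers only suffixes that themselves begin with a 1; a string-based version would be needed for the general case.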

3. Theorems

First, the consequences of M having dead states will be shown.
Proposition 1. Let $M = M_A$ and suppose:
(a) $\delta(q_0, \alpha)$ is a dead state,
(b) $\lambda_0 = (\alpha + 1)/\alpha$, and
(c) $\lim_{n \to \infty} \pi_A(n)/\pi_A(\lambda_0 n) = \theta$.
Then $\theta = 1$.

Proof. Put $n_m = 2^m \alpha$. Then $\lambda_0 n_m = 2^m(\alpha + 1)$. By assumption (a), there is no
$\beta$ such that $\alpha \cdot \beta \in A$. In other words, no matter what m is chosen, there is no $a \in A$
such that

$$n_m = 2^m \alpha \le a < 2^m \alpha + 2^m = \lambda_0 n_m,$$

where $m = L(\beta)$.

Thus $\pi_A(n_m)/\pi_A(\lambda_0 n_m) = 1$, because there are no members of A between $n_m$ and
$\lambda_0 n_m$. But $\{\pi_A(n_m)/\pi_A(\lambda_0 n_m)\}$ is an infinite subsequence of the sequence
$\{\pi_A(n)/\pi_A(\lambda_0 n)\}$, supposed by assumption (c) to be convergent. It follows that the limit, $\theta$,
of this sequence (if the limit exists) is 1.

The underlying fact, then, is that dead states produce large gaps in the sequence of
numbers recognized by a machine. Below, it is shown that if there is no dead state,
the gaps cannot grow in the same manner.
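The gap phenomenon can be observed numerically. The sketch below assumes, purely for illustration, the regular set of numbers whose binary expansion contains no two adjacent 1's; for this set the prefix 11 (that is, $\alpha = 3$) leads to a dead state. The set and all function names are our own choices, not taken from the text.

```python
def in_A(x: int) -> bool:
    """Hypothetical regular set: binary expansions with no two adjacent 1s."""
    return "11" not in bin(x)[2:]


def pi_A(n: int) -> int:
    """Number of members of A less than n."""
    return sum(1 for x in range(1, n) if in_A(x))


ALPHA = 0b11  # the prefix 11 drives the reduced automaton into its dead state


def gap_is_empty(m: int) -> bool:
    """True if A has no member a with 2^m * alpha <= a < 2^m * (alpha + 1)."""
    n_m = ALPHA << m
    return not any(in_A(a) for a in range(n_m, (ALPHA + 1) << m))


if __name__ == "__main__":
    for m in range(8):
        n_m = ALPHA << m
        # The two counts agree because the interval between them is empty.
        print(m, gap_is_empty(m), pi_A(n_m), pi_A((ALPHA + 1) << m))
```

Every interval $[2^m \alpha, 2^m(\alpha + 1))$ is free of members of A, so the ratio of the two counts is exactly 1 along this subsequence.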

We immediately deduce the weakened, but easier-to-use:

Proposition 2. If $\pi_A(n)/\pi_A(\lambda n) \to \theta(\lambda)$ for all real $\lambda$ and if $\theta(\lambda) = 1$ only
if $\lambda = 1$, then A cannot be a regular set whose reduced automaton has a dead state.




In some interesting cases (see below), $\pi(n)/\pi(\lambda n)$ fails to converge. We can still
sometimes use a sharper but less elegant criterion:

Proposition 3. Let $a_r$ be the rth member of A in order of magnitude. Then if

$$\lim_{r \to \infty} \frac{a_{r+1} - a_r}{a_r} = 0,$$

A cannot be a regular set whose reduced automaton has a dead state.

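Proof. Suppose that A is a regular set whose reduced automaton has a dead
state, and let $\alpha$, $\lambda_0$, and $n_m$ be defined as in the proof of Proposition 1. Denote by $k_m$
the number of members of A smaller than $n_m$, i.e., $k_m$ is the largest integer such that
$a_{k_m} < n_m$. Since no member of A can lie between $n_m$ and $\lambda_0 n_m$, we have:

$$a_{k_m + 1} \ge \lambda_0 n_m > \lambda_0 a_{k_m}.$$

Thus

$$\frac{a_{k_m + 1} - a_{k_m}}{a_{k_m}} > \lambda_0 - 1 = \frac{1}{\alpha}.$$

It follows that $(a_{r+1} - a_r)/a_r$ cannot converge to 0 as $r \to \infty$.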
Finally, we look at the other side of the coin: what happens if M has no dead
state?
Proposition 4. If A is regular and $M_A$ has no dead state, then $\pi_A(n)/n > 2^{-(K+1)}$,
where K is the number of states of $M_A$. Thus the density of A cannot converge to zero.
Proof. For each integer l we shall define a 1-1 into map

$$g : [2^l, 2^{l+1}) \to [2^l, 2^{l+K}) \cap A,$$

where [a, b) denotes the interval $a \le x < b$.

For any $\alpha \in [2^l, 2^{l+1})$, let $\beta$ be the smallest integer for which $\alpha \cdot \beta \in A$. Such a $\beta$
must exist because $\delta(q_0, \alpha)$ is not dead. Moreover, $\beta < 2^{K-1}$ because the shortest path
from $\delta(q_0, \alpha)$ to a member of $Q_F$ cannot be longer than K - 1. It follows that
$\alpha \cdot \beta < 2^{l+K}$, so that $\alpha \cdot \beta \in [2^l, 2^{l+K}) \cap A$. Thus if we define
$g(\alpha) = \alpha \cdot \min\{\beta \mid \alpha \cdot \beta \in A\}$, g has the required range. To see that it is 1-1, we simply note
that $\alpha$ is recoverable as the first l + 1 digits of $g(\alpha)$.

It follows that $[2^l, 2^{l+K}) \cap A$ contains at least as many members as $[2^l, 2^{l+1})$.
Therefore, $\pi_A(2^{l+K}) \ge 2^{l+1} - 2^l = 2^l$. Now consider an arbitrary number n.
For some l, $n \in [2^{l+K}, 2^{l+K+1})$, and since $\pi_A(n)$ increases monotonically,

$$\pi_A(n) \ge \pi_A(2^{l+K}) \ge 2^l > 2^{-(K+1)} n.$$
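The completion map g used in this proof can be tried out on a small case. The sketch below assumes the set of multiples of 3, whose automaton has no dead state, and works on binary strings so that suffixes with leading zeros (and the empty suffix) are allowed; this is a simplification of the integer concatenation of item (4). The value K = 4 is an assumed upper bound on the suffix search, not a claim about the exact state count.

```python
from itertools import product

K = 4  # assumed bound on the number of states; any valid bound works here


def in_A(x: int) -> bool:
    """Hypothetical regular set with no dead state: multiples of 3."""
    return x % 3 == 0


def g(alpha: int) -> int:
    """Extend alpha by the shortest (then smallest) suffix that lands in A."""
    prefix = bin(alpha)[2:]
    for length in range(K):
        for bits in product("01", repeat=length):
            candidate = int(prefix + "".join(bits), 2)
            if in_A(candidate):
                return candidate
    raise ValueError("no short completion: the automaton would have a dead state")


def images(l: int) -> list:
    """g applied to every (l+1)-digit number."""
    return [g(alpha) for alpha in range(2 ** l, 2 ** (l + 1))]


if __name__ == "__main__":
    l = 5
    vals = images(l)
    # Distinct inputs of equal length give distinct images.
    print(len(vals), len(set(vals)))
    print(all(in_A(v) and 2 ** l <= v < 2 ** (l + K) for v in vals))
```

Because all inputs have the same length, each image determines its input as a prefix, which is why the map is one-to-one.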

Combining this result with the consequences of Propositions 2 and 3 leads
to the following Criterion.

Criterion. To prove that a set, A, is not regular, it is sufficient to verify
Condition 1 and Condition 2 or 2'.

Condition 1. $\pi_A(n)/n \to 0$ as $n \to \infty$.

Condition 2. $\pi_A(n)/\pi_A(\lambda n) \to \theta(\lambda)$ as $n \to \infty$, and $\theta(\lambda) \ne 1$ for all $\lambda \ne 1$.

Condition 2'. $(a_{n+1} - a_n)/a_n \to 0$ as $n \to \infty$.

If A is regular, by Proposition 4 it has a dead state but by Proposition 2 or 3 it
has none.
4. Applications

First, some examples are discussed that can be settled using the Criterion of
Section 3.
Example 1. $A = \{n^k\}$ (k a fixed integer). Clearly $\pi_A(n) \sim n^{1/k}$ so that:

(1) $\pi_A(n)/n \sim n^{(1/k) - 1} \to 0$ as $n \to \infty$,

(2) $\pi_A(n)/\pi_A(\lambda n) \to \lambda^{-1/k} \ne 1$ for all $\lambda \ne 1$.
Therefore A is not regular.¹
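The limit in (2) is easy to check numerically. The following is a minimal sketch, assuming k > 1 and integer $\lambda$; the helper names are ours.

```python
def count_kth_powers_below(n: int, k: int) -> int:
    """pi_A(n) for A = {m**k}: the number of m >= 1 with m**k < n."""
    m = int(round((n - 1) ** (1.0 / k)))  # floating-point guess
    while m ** k >= n:                    # correct the guess downward
        m -= 1
    while (m + 1) ** k < n:               # or upward
        m += 1
    return m


def ratio(n: int, lam: int, k: int) -> float:
    """pi_A(n) / pi_A(lam * n), to be compared with lam ** (-1 / k)."""
    return count_kth_powers_below(n, k) / count_kth_powers_below(lam * n, k)


if __name__ == "__main__":
    for n in (10 ** 4, 10 ** 6, 10 ** 8):
        print(n, ratio(n, 4, 2), 4 ** (-1 / 2))
```

For squares and $\lambda = 4$ the printed ratios approach 1/2, in agreement with $\lambda^{-1/k}$.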

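Example 2. Let P be the set of prime numbers. It is well known that $\pi_P(n) \sim
n/\log n$. Thus $\pi_P(n)/n \sim 1/\log n \to 0$ as $n \to \infty$, satisfying Condition 1. On the
other hand,

$$\frac{\pi_P(n)}{\pi_P(\lambda n)} \approx \frac{n}{\log n} \cdot \frac{\log \lambda n}{\lambda n} = \frac{\log \lambda + \log n}{\lambda \log n} \to \frac{1}{\lambda} \ne 1 \quad (\text{for } \lambda \ne 1).$$

Again, Conditions 1 and 2 are satisfied; therefore, P is not regular.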
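As a rough numerical illustration of the ratio for $\lambda = 2$, the following sketch tabulates the prime-counting function with a sieve of Eratosthenes. The function name and the table layout are our own; convergence to 1/2 is slow because of the logarithmic correction term.

```python
import math


def prime_count_table(limit: int) -> list:
    """Return counts with counts[n] = number of primes less than n (n <= limit)."""
    is_prime = bytearray([1]) * limit          # flags for 0 .. limit-1
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Strike out the multiples of p starting at p*p.
            is_prime[p * p::p] = bytearray(len(range(p * p, limit, p)))
    counts = [0] * (limit + 1)
    for x in range(limit):
        counts[x + 1] = counts[x] + is_prime[x]
    return counts


if __name__ == "__main__":
    counts = prime_count_table(200000)
    for n in (1000, 10000, 100000):
        # Compare with (log 2 + log n) / (2 log n), which tends to 1/2.
        estimate = (math.log(2) + math.log(n)) / (2 * math.log(n))
        print(n, counts[n] / counts[2 * n], estimate)
```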

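Example 3. Let B be the set of all prime powers, i.e., $B = \{p^m \mid p \text{ prime}, m \text{ an
integer}\}$. Write B as:

$$B = B_1 \cup B_2 \cup \cdots \cup B_k \cup \cdots, \quad \text{where } B_k = \{p^k \mid p \text{ prime}\}.$$

Then for each k we have, exactly, $\pi_{B_k}(n) = \pi_P(n^{1/k})$.

To compute $\pi_B(n)$, first note that if $n_0 \in B$, $n_0 < n$, then for exactly one k and
p, $n_0 = p^k < n$, so that:

$$k < \log_p n, \quad k \le \log_2 n, \quad \text{i.e.,} \quad n_0 \in B_1 \cup \cdots \cup B_{[\log_2 n]}.$$

It follows that

$$\pi_B(n) = \pi_{B_1}(n) + \cdots + \pi_{B_{[\log_2 n]}}(n) = \pi_P(n) + \pi_P(n^{1/2}) + \cdots + \pi_P(n^{1/[\log_2 n]}).$$

Thus

$$\pi_B(n) \approx \frac{n}{\log_e n} + \frac{2 n^{1/2}}{\log_e n} + \cdots + \frac{[\log_2 n] \, n^{1/[\log_2 n]}}{\log_e n}.$$

Since

$$2 n^{1/2} + \cdots + [\log_2 n] \, n^{1/[\log_2 n]} < (\log_2 n)^2 n^{1/2} \quad \text{and} \quad \frac{(\log_2 n)^2 n^{1/2}}{n} \to 0 \text{ as } n \to \infty,$$

we have

$$\pi_B(n) \sim n/\log_e n \sim \pi_P(n).$$

Thus the set B has the same asymptotic density as P and is not regular. A similar
argument shows that the set $\{n^m \mid n, m \text{ integers}, m > 2\}$ is not regular.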
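The claim that prime powers and primes have the same asymptotic density can be checked by direct counting. This is a small sketch with helper names of our own choosing; it counts $p^m$ for $m \ge 1$ below n.

```python
import math


def primes_below(n: int) -> list:
    """All primes less than n, by the sieve of Eratosthenes."""
    if n < 3:
        return []
    flags = bytearray([1]) * n
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            flags[p * p::p] = bytearray(len(range(p * p, n, p)))
    return [x for x in range(n) if flags[x]]


def pi_P(n: int) -> int:
    """Number of primes less than n."""
    return len(primes_below(n))


def pi_B(n: int) -> int:
    """Number of prime powers p**m (m >= 1) less than n."""
    count = 0
    for p in primes_below(n):
        q = p
        while q < n:      # count p, p^2, p^3, ... below n
            count += 1
            q *= p
    return count


if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 4, 10 ** 5):
        print(n, pi_P(n), pi_B(n), pi_B(n) / pi_P(n))
```

The printed ratio drifts toward 1, since the higher powers contribute only on the order of $n^{1/2}$ terms.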

Example 4. We now illustrate the use of the Criterion in cases where the ratio
$\pi_A(n)/\pi_A(\lambda n)$ fails to converge. Let C be the set of binary palindromes, i.e.,
sequences invariant under reversal. To show that C is not regular, first verify that
Condition 1 is satisfied. This is easy; since the first half of the binary digits of an
n-digit palindromic number is determined by the last half, neglecting the small
odd-even effects, then $\pi_C(n)/n \approx (\sqrt{n})/n \to 0$. Then, by Proposition 4, if C is regular
it must be recognized by an automaton with a dead state. But this is impossible
since every sequence can be completed to produce a palindrome of twice its length.
Q.E.D. One could, although it would be foolish to do so in this case, come to the
same conclusion by using Condition 2'. To do so, estimate the difference $a_{n+1} - a_n$
between the nth palindrome and the next. If $a_n$ is of even length, 2k, it can be written
$a_n = b_n 2^k + \bar{b}_n$, where $\bar{b}_n$ is the sequence of digits of $b_n$ written in reverse order. If
$a_n$ is of odd length, 2k + 1, it has the form $a_n = b_n 2^{k+1} + \delta 2^k + \bar{b}_n$, where $\delta$ is 0 or 1.
It is easy to see that $b_{n+1}$ is either $b_n$ or $b_n + 1$, so that in each case $a_{n+1} - a_n$ cannot
be larger than the order of $\sqrt{a_n}$. It follows that $(a_{n+1} - a_n)/a_n \to 0$ as $n \to \infty$.

¹ This application includes the result proved by Ritchie [3] by ad hoc arguments on the set
of perfect squares in binary.
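The gap estimate for palindromes can be verified by brute force for small lengths. The sketch below is illustrative only; the constant 3 in the comparison with $\sqrt{a_n}$ is our own choice of a comfortable bound, not a value taken from the text.

```python
import math


def binary_palindromes(limit: int) -> list:
    """All positive integers below limit whose binary expansion is a palindrome."""
    result = []
    for x in range(1, limit):
        s = bin(x)[2:]
        if s == s[::-1]:
            result.append(x)
    return result


def gaps(pals: list) -> list:
    """Pairs (a_n, a_{n+1} - a_n) for consecutive palindromes."""
    return [(a, b - a) for a, b in zip(pals, pals[1:])]


if __name__ == "__main__":
    pals = binary_palindromes(2 ** 16)
    print(len(pals), len(pals) / 2 ** 16)           # density is already small
    print(max(d / math.sqrt(a) for a, d in gaps(pals)))
    print(max(d / a for a, d in gaps(pals) if a >= 2 ** 12))
```

The relative gaps shrink as the palindromes grow, which is the behavior Condition 2' asks for.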

It is interesting to compare this with the (regular) set of sequences of doublets,
i.e., sequences like 00110011110011. This set has roughly the same sort of global
distribution as the palindromes; but the fine structure of its distribution of gaps
causes it (rightly) to elude the Criterion.

The following examples show how the Criterion can fail to yield useful
information.

Example 5. Let A(c) be the set of all powers of some fixed integer c, i.e.,
$A(c) = \{c^k \mid c \text{ fixed}\}$. Then $\log_c n - 1 \le \pi_A(n) \le \log_c n$ so that $\pi_A(n) \sim
\log_c n$. Condition 1 is satisfied.

However, $\pi_A(n)/\pi_A(\lambda n) \to \log_c n/\log_c \lambda n \to 1$ as $n \to \infty$, so Condition 2 fails,
and so does Condition 2': $(a_{n+1} - a_n)/a_n = c - 1$. The Criterion gives no
information in this case. In fact A(2) is regular (for a binary machine), while A(3) is not
(and vice versa for a ternary machine).
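Both halves of this example can be illustrated briefly. In the sketch below the regular expression `10*` is our own rendering of the binary language of A(2), and the relative-gap computation shows only that Condition 2' fails for A(3); it does not by itself prove that A(3) is nonregular.

```python
import re

POWER_OF_TWO = re.compile(r"10*")  # a 1 followed by any number of 0s


def is_power_of_two(x: int) -> bool:
    """Recognize A(2) (including 2**0) from the binary string alone."""
    return POWER_OF_TWO.fullmatch(bin(x)[2:]) is not None


def relative_gaps(c: int, count: int) -> list:
    """(a_{n+1} - a_n) / a_n for the first `count` powers of c."""
    powers = [c ** k for k in range(1, count + 1)]
    return [(b - a) / a for a, b in zip(powers, powers[1:])]


if __name__ == "__main__":
    print([x for x in range(1, 70) if is_power_of_two(x)])
    print(set(relative_gaps(3, 20)))  # constant, equal to c - 1
```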

Example 6. Periodic sequences such as 101, 101101, 101101101, ... have
$\pi(n) \sim k \log n$ so that

$$\frac{\pi(n)}{\pi(\lambda n)} \to 1$$

and Condition 2 fails. (All such sets are regular.)

5. Impossibility of a Converse

Let A be any infinite regular set with an infinite complement and let $\varphi(x)$ be a
noncomputable function with values 0 and 1. Let A' be the subset of A defined by
$x \in A'$ if $x \in A$ and $(x + 1) \notin A$. A' is infinite. Let g(n) be an enumeration of
A', without repetitions, and define A'' by the conditions:

(1) If $x \in (A - A')$, $x \in A''$.

(2) If $x = g(n) \in A'$, then put x in A'' if $\varphi(n) = 0$; otherwise put x + 1 in A''.
It is clear that A'' is not computable, and a fortiori not regular; otherwise $\varphi(n)$
could be computed by observing a machine that recognizes A'', since $\varphi(n) = 0$
if and only if $g(n) \in A''$. But $\pi_A(n)$ and $\pi_{A''}(n)$ differ by, at the most, 1. Thus,
while tests based on asymptotic density can give evidence against regularity, they
cannot give evidence for regularity.
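The construction can be traced on a concrete case, with one unavoidable compromise: a noncomputable $\varphi$ cannot be programmed, so the sketch below substitutes a computable stand-in. The choice of A (numbers congruent to 0 or 1 modulo 4), the stand-in `phi`, and all function names are assumptions made for illustration.

```python
def phi(n: int) -> int:
    """Computable stand-in for the noncomputable 0/1 function of the text."""
    return bin(n).count("1") % 2


def in_A(x: int) -> bool:
    """Hypothetical regular set with infinite complement: x = 0 or 1 (mod 4)."""
    return x % 4 in (0, 1)


def in_A1(x: int) -> bool:
    """A': members x of A with x + 1 not in A (here, x = 1 mod 4)."""
    return in_A(x) and not in_A(x + 1)


def g(n: int) -> int:
    """Enumeration of A' without repetitions, n = 0, 1, 2, ..."""
    return 4 * n + 1


def in_A2(x: int) -> bool:
    """A'': keep A - A'; replace g(n) by g(n) + 1 exactly when phi(n) = 1."""
    r = x % 4
    if r == 0:
        return True                     # x is in A - A'
    if r == 1:
        return phi((x - 1) // 4) == 0   # x = g(n) stays when phi(n) = 0
    if r == 2:
        return phi((x - 2) // 4) == 1   # x = g(n) + 1 enters when phi(n) = 1
    return False


def pi(member, n: int) -> int:
    """Number of members below n of the set decided by `member`."""
    return sum(1 for x in range(1, n) if member(x))


if __name__ == "__main__":
    print(max(abs(pi(in_A, n) - pi(in_A2, n)) for n in range(1, 200)))
    print([phi(n) for n in range(8)], [int(not in_A2(g(n))) for n in range(8)])
```

The two counting functions never differ by more than 1, while membership of g(n) in A'' reads off the value of `phi(n)`.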

6. Upper Bound of Growth Rate of Regular Sets

For the sake of completeness, the following more superficial result is included.
Proposition 5. If A is regular and infinite, there is some K > 0 such that
$\pi_A(n) > K \log n$.
Proof. Suppose A is regular and that $M_A$ has N states. Let N' be any integer.
Then there is some $x \in A$ with $N' \le L(x) < N' + N$. To see this, note first that
if $\delta(q_0, y)$ were dead for every y with $L(y) = N'$, then A would be finite. So we
can choose a y with $L(y) = N'$, and for which $y \cdot y_1 \in A$ for some $y_1$. But if $y_1$ is
chosen to produce the shortest path from $\delta(q_0, y)$ to $\delta(q_0, y \cdot y_1)$, then $L(y_1) < N$.
Thus $x = y \cdot y_1$ is in the stated range.

It follows that there is at least one member of A in each of the intervals $(1, 2^N)$,
$(2^N, 2^{2N})$, ..., $(2^{lN}, 2^{(l+1)N})$, .... Therefore, $\pi_A(2^{lN}) \ge l$, and hence, $\pi_A(n) \ge
(\log n)/N = K \log n$.

Example 7. It follows from Proposition 5 that sets such as $A = \{2^{2^n}\}$, which
increase "faster than exponentially," cannot be regular.



7. Discussion

Many questions remain. To what extent will the same kind of methods work on,
say, pushdown machines? The criteria will have to change in detail (for example,
the palindromes would be recognizable now), but we are inclined to suppose that
the pushdown machines will also fail to recognize the arithmetically interesting
examples, and that the gap and density arguments can be refined to show this.
We are curious as to whether one can show that $\{n \mid n \text{ is prime}\}$ is not regular by
much more elementary means. If not, this might suggest some nontrivial relation
between automata theory and number-theoretical areas, such as the theory
of rational approximations.